Method for obtaining dynamic and structural data pertaining to proteins and protein/ligand complexes

ABSTRACT

This invention provides an NMR method for obtaining both entropic and enthalpic data on proteins and protein/ligand complexes which can be used to obtain accurate structural and dynamic data of proteins and protein complexes having a wide range of molecular weights. An embodiment of the invention provides proteins which contain at least one bond vector whose dynamics are to be measured and which is surrounded by NMR inactive nuclei, and amino acids for synthesis of the proteins via chemical means or biological expression. The NMR methods using specifically labeled proteins for analysis result in maximization of the sensitivity and resolution of the NMR experiments, and minimization of the loss of signal due to diffusion.

RELATED CASES

This application is based on and claims priority to U.S. provisional patent application No. 60/386,739, filed on Jun. 10, 2002, the entire contents of which are incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Technical Field

This application relates to the field of drug design, and in particular to methods for obtaining dynamic and entropic data from specific locations of proteins and protein/ligand complexes over a wide range of molecular weights.

2. Description of the Background Art

The affinity of two molecules for each other is governed by an equation called the Gibbs' free energy equation:

ΔG = ΔH − TΔS

where G is the Gibbs' free energy, H is the enthalpic (structural) component, S is the entropic (dynamic) component and T is the absolute temperature. If the change in Gibbs' free energy on binding is negative, then the molecules will spontaneously bind. Conversely, if the change in Gibbs' free energy on binding is positive, the molecules will tend to dissociate.
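The relationship above can be illustrated with a short Python sketch. It is a minimal illustration only: the function names and the two sets of ligand values are hypothetical and are not taken from FIG. 1, and the conversion to a dissociation constant assumes the standard relation Kd = exp(ΔG/RT) with a 1 M standard state.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def gibbs_free_energy(delta_h, delta_s, temperature=298.0):
    """Return dG (kJ/mol) from dH (kJ/mol) and dS (kJ/(mol*K))."""
    return delta_h - temperature * delta_s


def dissociation_constant(delta_g, temperature=298.0):
    """Return Kd (relative to a 1 M standard state) for a binding dG in kJ/mol."""
    return math.exp(delta_g / (R * temperature))


# Hypothetical ligand A: strong enthalpic fit, large entropic penalty.
dg_good_fit = gibbs_free_energy(delta_h=-50.0, delta_s=-0.120)

# Hypothetical ligand B: weak enthalpic fit, favorable entropy change.
dg_poor_fit = gibbs_free_energy(delta_h=-10.0, delta_s=0.080)

kd_good_fit = dissociation_constant(dg_good_fit)
kd_poor_fit = dissociation_constant(dg_poor_fit)
```

With these assumed numbers, the ligand with the poorer structural fit ends up with the more negative ΔG and therefore the smaller Kd, which is the kind of entropic offset the text describes.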

The Gibbs' free energy has two component parts. Either the structural or dynamic component can be dominant in the binding free energy of two molecules. In other words, a good structural fit between two molecules can be more than offset by an accompanying entropic penalty or, by contrast, the effect of a poor fit can be substantially offset by a favorable change in entropy on binding. For example, FIG. 1 shows the relative binding affinities of a panel of ligands to mouse urinary protein (MUP) at 298 K and the relative contributions of the enthalpic and entropic components of the Gibbs' free energy. Currently scientists in a number of fields are interested in obtaining a measure of the elusive entropic component of binding free energy, and have attempted to do so using a variety of biophysical methods, including isothermal titration calorimetry and molecular dynamics simulations. See, e.g., Schoen, Biochemistry 28:5019-5024, 1989; Chervenak and Toone, J. Am. Chem. Soc. 116:10533-10539, 1994; Dam et al., J. Biol. Chem. 273:32812-32817, 1998; Bundle et al., 120:5317-5318, 1998; Williams and Bardsley, Perspect. Drug Discov. Design 17:43-59, 1999; Shafer et al., 113:7809-7817, 2000; Caldesone and Williams, J. Am. Chem. Soc. 123:6262-6267, 2001; Harris et al., J. Am. Chem. Soc. 123:12658-12663, 2001; Clark et al., J. Am. Chem. Soc. 123:12238-12247, 2001; Schafer et al., 43:45-56, 2001.

These phenomena have a significant impact with regard to drug design. A drug is no more than a chemical entity which binds to a part of the human or pathogen biomolecular machinery, thereby impeding or imitating a biological function. Design of new drug molecules is exceptionally difficult since the molecule, to be effective, must bind tightly to the desired specific molecular target yet not bind to any of the vast number of other molecules of the body. Therefore, ignorance of the entropy change upon binding is an enormous handicap to discovering molecules which specifically and tightly bind to the target of interest.

The pharmaceutical industry has made a major attempt to improve the efficiency of drug design over the historically used methods of isolating and concentrating plant extracts and the like which were found to have desirable therapeutic effects, and attempting to improve the active ingredients through trial and error chemical modifications. Needless to say, these former processes were very slow and inefficient. The new methods include genomics (identifying all the component molecular parts of the human body), proteomics (identifying those parts of the biomolecular machinery involved in a given disease condition), combinatorial chemistry (synthesis of huge numbers of compounds which can be tested rapidly for desired characteristics using high throughput screening), and rational drug design (designing drugs on the basis of high resolution structural data of a molecular target of interest).

X-ray crystallography is widely used to obtain an estimate of the structure of proteins and can provide the complete tertiary structure (global fold) of the backbone of a crystallized protein. This method, however, has several disadvantages. For example, only proteins which can be crystallized may be studied using X-ray crystallography. Some proteins are very difficult or impossible to crystallize. Moreover, crystallization can be very time consuming and expensive. Another major disadvantage of this method is that the structural information obtained may be pertinent only to the crystalline structure of the protein and not to the structure of the protein in solution. The bond angles present in a crystal structure may not be the same as those of the protein when it is in an active conformation and therefore may not provide information relevant to the biological or physiological system of interest. Nor can this method provide any information whatsoever concerning the entropic component of protein folding or binding of a ligand.

High throughput screening of drug candidate compounds for binding of the target molecule by definition detects those compounds that have a favorable Gibbs' free energy change on binding to the target. But screening cannot provide information concerning the enthalpic or entropic components of the Gibbs' free energy change. Conversely, X-ray crystallographic structural data can provide enthalpic data pertaining to the crystallized protein, but no dynamic data because the material must be crystallized (rigid) for the technique to work. The pharmaceutical industry therefore is forced to rely on multiple research projects, none of which is capable of supplying complete data needed for successful drug design. Even more importantly, the techniques available for use in rational drug design do not provide any information on the contribution of individual functional groups of protein or of ligand to the Gibbs' free energy change of binding of a ligand to a protein. In effect, drug designers presently must proceed without any information on many of the components of the biomolecular system that impact on the binding of the drugs under study. Therefore, a technique that can reliably and rapidly provide both complete enthalpic and entropic data on proteins would be highly useful.

Protein structure determination by high resolution multinuclear NMR also has become well known. In NMR spectroscopy, magnetization of certain atomic nuclei (usually protons) in a powerful magnetic field is detected by the absorption of radio waves. NMR has become a major tool in the study and analysis of small (<1 kD) molecules. For analysis of larger molecules such as proteins, it is essential in most applications to replace the natural abundance atoms of carbon and nitrogen (¹²C and ¹⁴N) universally with the NMR active stable isotopes ¹³C and ¹⁵N so that assignment of each of the detected NMR signals can be made reliably. See Ikura et al., Abstr. Pap. Am. Chem. Soc. 199:97-POLY (1990); Ikura et al., Abstr. Pap. Am. Chem. Soc. 199:107-INOR (1990); Bax, Curr. Opin. Struct. Biol. 4:738-744 (1994). Using isotopic labeling of this type, NMR has been used to determine the structure of several proteins. See Ikura et al., Biochemistry 30:9216-9228 (1991); Clore and Gronenborn, Nature Struct. Biol. 4:849-853 (1997). Moreover, because NMR can be carried out on a protein in solution, in contrast to X-ray crystallography, which requires crystalline material, NMR studies can measure dynamic parameters. See Kay et al., Biochemistry 28:8972-8979 (1989).

In principle therefore, NMR can provide both enthalpic and entropic data on a drug target such as a protein. The actual use of NMR for these purposes, however, has been limited to the study of only relatively small molecules. Because the magnetization of the nuclei (¹H, ¹³C, ¹⁵N) in a protein tends to diffuse more easily with increasing molecular weight, the signal-to-noise ratio decreases with the size of the molecule being studied, rendering the data more difficult to interpret as the protein size increases. Thus, structure determinations of proteins have been restricted to sizes of about 35 kD or less and dynamics studies have been restricted to molecules of about 20 kD or less. In practice, this means that NMR has made only a modest impact to date on the drug design process. In essence this is because the very isotopes needed to assign the protein NMR signals in the first place, such as ¹³C, allow the magnetization to diffuse. In addition, the signals being assigned are split into multiplets by neighboring isotopes and this splitting further degrades the signal with respect to noise.

Attempts to ameliorate this degradation, for example using substitution with additional isotopes such as deuterium, can add complications due to the properties of their nuclear spin (the spin quantum number equals 1 in the case of deuterium). Thus measurement of deuterium relaxation, while attractive in view of theoretical simplicity, results in additional signal losses in NMR studies since only 66% of the signal energy can be transmitted to and from the deuterium nucleus, resulting in 43% efficiency. Muhandiram et al., J. Am. Chem. Soc. 117:11536-11544 (1995).
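The quoted efficiency is consistent with applying the 66% transfer twice, once to and once from the deuterium nucleus. The following Python sketch shows that arithmetic under the assumption that the two transfer steps are independent and equally efficient; the function name is illustrative only.

```python
def round_trip_efficiency(one_way_fraction, steps=2):
    """Overall efficiency after `steps` independent transfers of equal efficiency."""
    return one_way_fraction ** steps


# 66% transmitted to the deuterium nucleus, then 66% transmitted back.
deuterium_efficiency = round_trip_efficiency(0.66)
```

Under this assumption the product is 0.66 × 0.66 = 0.4356, in line with the roughly 43% efficiency stated above.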

The difficulties encountered in these NMR structural determinations are due largely to using proteins for study which are universally isotopically enriched. Such proteins are used because the labeling systems that are currently commercially available yield only random, universal enrichment. Such universal labeling yields split signals in the NMR spectrum, each of which needs to be assigned before a structure determination can be commenced. Splitting of signals results in both more and weaker signals. This phenomenon causes overlap of signals and a far inferior signal-to-noise ratio, both of which make the assignment process more difficult and both of which become markedly worse with increasing protein size. Therefore, in practice these methods can provide fairly accurate structures only of small proteins.

Recently methods have been described which attempt to overcome these problems and dramatically increase the resolution and sensitivity of NMR spectra in structural studies of proteins. These methods utilize specific isotopic enrichment of the protein in the backbone only, which greatly facilitates both the detection and assignment of the NMR signals and the calculation of the structure of the protein (Coughlin et al., J. Am. Chem. Soc. 121, 11871-11874, 1999; Giesen et al., J. Biomol. NMR 19, 255-260, 2001; Giesen et al., J. Biol. NMR., pp. 1-9, 2002).

Several studies have been undertaken to develop NMR methods for study of protein dynamics. These studies on the entropic contribution to binding have focused on side-chain methyl groups in proteins. These studies have used either ²H or ¹³C relaxation measurements in the ¹³CH₂D or ¹³CHD₂ isotopomers, respectively. Lee et al., Nature Str. Biol. 7:72-77, 2000; Muhandiram et al., J. Am. Chem. Soc. 117:11536-11544, 1995; Yang et al., J. Mol. Biol. 276:939-954, 1998; Mittermaier et al., J. Biomol. NMR 13:181-185, 1999; Lee et al., J. Am. Chem. Soc. 121:2891-2902, 1999; Ishima et al., J. Am. Chem. Soc. 121:11589-11590, 1999; Ishima et al., J. Am. Chem. Soc. 123:6164-6171, 2001; Skrynnikov et al., J. Am. Chem. Soc. 123:4556-4566, 2001.

However, as with the structural studies, there are sensitivity problems with all the approaches tried so far which have limited the measurement of dynamics and entropy to very small proteins. Although compelling reasons exist to choose ²H relaxation measurements where possible, in practice ¹³C relaxation measurements are considerably more sensitive. Muhandiram et al., J. Am. Chem. Soc. 117:11536-11544, 1995; Lee et al., J. Am. Chem. Soc. 121:2891-2902, 1999. Indeed, in studies using the mouse major urinary protein (MUP) as a model system for thermodynamics of ligand-protein interactions, considerable difficulty was experienced in measuring accurate ²H relaxation rates for valine methyl groups in methyl ¹³C, 50% ²H-enriched protein due to the combined effects of resonance overlap and relatively poor sensitivity, despite the only modest size of this protein (~19 kD). The low sensitivity derived from the fact that only the ¹³CH₂D isotopomer was detected in ²H relaxation measurements, whereas the sample contained all possible ²H isotopomers, even with optimized labeling schemes. See Ishima et al., J. Biomol. NMR 21:167-171, 2001. Consequently, the effective protein concentration in the sample was reduced considerably.

Therefore, because sensitivity is especially important in larger proteins, ¹³C relaxation measurements are the only viable approach for such molecular systems. ¹³C relaxation studies on side-chain methyl groups in proteins typically have involved ¹³CHD₂ isotopomers, where the ¹³C relaxation mechanism is particularly straightforward in the presence of a single proton. While such isotopomers can be obtained in proteins overexpressed in bacteria by use of conventional ¹³C-enriched and fractionally deuterated media, again all possible ²H isotopomers are obtained in these systems. Thus the desired isotopomer (¹³CHD₂) is diluted by the undesirable ones (¹³CH₂D, ¹³CH₃, ¹³CD₃). This results in a loss of both resolution and sensitivity, which becomes particularly severe for even modest-size proteins, for example proteins of 20 kD or greater.
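The dilution of the desired isotopomer can be estimated with a simple model. The Python sketch below assumes that each of the three methyl hydrogen positions is deuterated independently with the same probability, so that the isotopomer populations follow a binomial distribution; this is an illustrative assumption that ignores isotope effects and the details of the biosynthetic pathway, and the function name is hypothetical.

```python
from math import comb


def methyl_isotopomer_fractions(deuterium_fraction):
    """Binomial estimate of methyl isotopomer populations.

    Assumes each of the three hydrogen positions is 2H with probability
    `deuterium_fraction`, independently of the others.
    """
    p = deuterium_fraction
    names = ["CH3", "CH2D", "CHD2", "CD3"]  # 0, 1, 2, 3 deuterons
    return {
        name: comb(3, k) * p**k * (1.0 - p) ** (3 - k)
        for k, name in enumerate(names)
    }


# Fractions for a 50% deuterated methyl group under this model.
fractions_50 = methyl_isotopomer_fractions(0.5)

# Deuteration level that maximizes CHD2 under this model (p = 2/3).
best_chd2 = methyl_isotopomer_fractions(2.0 / 3.0)["CHD2"]
```

Under this model a 50% deuterated sample carries only 37.5% of its methyl groups as CHD₂, and even the most favorable deuteration level gives 4/9 (about 44%), so more than half of the labeled methyl groups do not contribute to the desired signal.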

Additional problems can result from attempting to measure ¹³C relaxation rates in partially deuterated proteins with multiple isotopomers. Thus, while ¹³C relaxation measurements on the MUP system cited above resulted in much higher sensitivity, resonance overlap was very severe, due in part to the presence of contaminating resonances from ¹³CH₃ isotopomers in refocused INEPT studies designed to selectively detect resonances from ¹³CHD₂ isotopomers. The ¹³CH₃ isotopomers resonate at a different chemical shift from the ¹³CHD₂ isotopomers due to the deuterium isotope effect, and are incompletely suppressed in refocused INEPT studies as a consequence of the different relaxation rates in proteins of the 3/2 and 1/2 spin manifolds of the ¹³C spin in CH₃ groups.

Thus, to overcome resonance overlap difficulties of this type and the associated lack of sensitivity, a new labeling strategy heretofore unavailable is required to measure the dynamics and associated entropy values for apo-proteins and proteins complexed with ligands such as drug candidates. Therefore, there is a need for a technology platform that can provide both enthalpic and entropic data on a drug target such as a protein and particularly such proteins bound to a ligand such as a drug or drug candidate. There also is a need to provide methods for NMR spectroscopy to achieve significant enhancement of the size range over which dynamic data can be obtained for proteins and their complexes with drug candidates. A system that can do so on a functional group by functional group basis as an aid to drug design would be particularly useful.

SUMMARY OF THE INVENTION

Accordingly, embodiments of the invention provide methods and materials which can be used to obtain dynamic data on proteins and protein/ligand complexes over a wide range of molecular weights (such as for example 10 kD to 150 kD or more), including membrane proteins and multi-protein complexes. In particular, embodiments of the invention provide methods for significantly enhancing the sensitivity and resolution of measurements of dynamics of a protein and protein/ligand complexes using NMR spectroscopy which allows information to be obtained on large proteins and protein systems and which allows the contribution of single functional groups with respect to binding to be examined.

One aspect of the invention provides proteins which contain at least one isotopically labeled bond vector the dynamics of which are to be measured and which is surrounded by NMR inactive bond vectors. In this way the sensitivity and resolution of the NMR experiments are maximized, while the loss of signal due to diffusion is minimized.

Another aspect of the invention provides methods for preparing and analyzing proteins which are composed essentially completely of ¹²C, ¹⁴N and ²H nuclei, save for isolated ¹³C-H vectors at positions in the side chains of amino acids in the protein which it is desired to study by NMR.

A further aspect of the invention provides methods for preparing and analyzing proteins which are composed essentially completely of ¹²C, ¹⁴N and ²H nuclei, save for isolated ¹⁵N-H vectors at positions in the side chains of amino acids which it is desired to study by NMR.

A particularly preferred aspect of the invention provides a protein which is composed essentially completely of ¹²C, ¹⁴N and ²H nuclei, save for isolated ¹³C-H vectors at positions in the side chains of amino acids in the protein which it is desired to study by NMR. In another particularly preferred aspect of the invention, a protein is provided which is composed essentially completely of ¹²C, ¹⁴N and ²H nuclei, save for isolated ¹⁵N-H vectors at positions in the side chains of amino acids in the protein which it is desired to study by NMR.

A further aspect of the invention provides methods for the chemical synthesis of amino acids which are composed completely of ¹²C, ¹⁴N and ²H nuclei, save for isolated ¹³C-H vectors at positions in the side chains of the amino acids which it is desired to study by NMR.

A further aspect of the invention provides methods for the chemical synthesis of amino acids which are composed completely of ¹²C, ¹⁴N and ²H nuclei, save for isolated ¹⁵N-H vectors at positions in the side chains of the amino acids which it is desired to study by NMR.

Yet another aspect of the invention provides methods for the culture of cells in media containing specifically labeled amino acids, which provide for the prevention of isotopic scrambling.

Embodiments of the invention provide an amino acid wherein at least one bond vector in the side chain of the amino acid consists of two NMR-active nuclei bonded together and wherein essentially all other nuclei are NMR inactive. Embodiments of the invention further provide an amino acid wherein at least one bond vector in the side chain of the amino acid consists of two NMR-active nuclei bonded together and wherein essentially all other bond vectors are NMR inactive.

Embodiments of the invention further provide an amino acid as described above wherein the two NMR-active nuclei are ¹³C and ¹H, wherein the remainder of the carbon atoms in the amino acid are essentially ¹²C, wherein the nitrogen atoms in said amino acid are essentially ¹⁴N and wherein the remainder of the hydrogen atoms in the amino acid are essentially ²H. Embodiments of the invention also provide an amino acid as described above wherein the two NMR-active nuclei are ¹³C and ¹H, wherein the remainder of the carbon atoms in the amino acid are essentially ¹²C, wherein the nitrogen atoms in said amino acid are essentially ¹⁴N and wherein the remainder of the hydrogen atoms in said amino acid are natural abundance. Embodiments of the invention further provide an amino acid as described above wherein the two NMR-active nuclei are ¹⁵N and ¹H, wherein the remainder of the nitrogen atoms in the amino acid are essentially ¹⁴N, wherein the carbon atoms in the amino acid are essentially ¹²C and wherein the remainder of the hydrogen atoms in the amino acid are essentially ²H. Embodiments of the invention also provide an amino acid as described above wherein the two NMR-active nuclei are ¹⁵N and ¹H, wherein the remainder of the nitrogen atoms in said amino acid are essentially ¹⁴N, wherein the carbon atoms in the amino acid are essentially ¹²C and wherein the remainder of the hydrogen atoms in the amino acid are natural abundance.

Embodiments of the invention also provide a culture medium suitable for growth of protein-producing cells, such as bacterial, yeast, mammalian or insect cells, which medium comprises amino acids as described above, and proteins that comprise at least one amino acid as described above.

Further, embodiments of the invention provide a method of analyzing the dynamics of a bond vector of a protein comprising producing the protein in a form which comprises an amino acid as described above and subjecting the protein to NMR spectroscopy. Another embodiment of the invention provides a method of determining the entropic contribution of a bond vector of a protein bound to a ligand comprising producing the protein in a form which comprises an amino acid as described above and subjecting the protein to NMR spectroscopy in the presence and the absence of the ligand.

Embodiments of the invention further provide a method of preparing an isotopically substituted protein comprising culturing cells that express the protein in a medium containing at least one amino acid as described above and recovering the protein from the culture medium or from the cells.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 provides the relative binding affinities and Gibbs' free energy components of a panel of ligands to mouse urinary protein at 298 K.

FIG. 2 provides an example for the preparation of a precursor used in the synthesis of specifically labeled valine.

FIG. 3 provides an example for the preparation of specifically labeled amino acid.

FIG. 4 is an example of preparation of specifically labeled amino acids.

FIG. 5 is an example of preparation of intermediates for the synthesis of specifically labeled amino acids.

FIG. 6 provides a synthetic scheme for chemical synthesis of L-valine-α-D-¹²CD(¹³CHD₂)₂.

FIG. 7 is an SDS-PAGE gel (lane 1: molecular weight standards; lane 2: non-induced MUP; lane 3: induced MUP; lane 4: insoluble cell break pellet; lane 5: Ni-NTA column flow-through; lane 6: Ni-NTA column elution; lane 7: anion exchange fraction 1; lane 8: anion exchange fraction 2).

FIG. 8 shows NMR results for mouse urinary protein expressed with L-valine-α-D-¹²CD(¹³CHD₂)₂.

FIG. 9 shows a region of the ¹³C-¹H HSQC spectrum of methyl-¹³C, 50%-²H enriched MUP showing valine methyl correlations (9A); an equivalent spectrum of MUP selectively enriched with (¹³C^(γ1γ2)HD₂)₂ ¹²C^(α,β)D-L-valine (9B); and an overlay of regions of the ¹³C-¹H HSQC spectra of (¹³C^(γ1γ2)HD₂)₂ ¹²C^(α,β)D-L-valine enriched MUP in complex with 2-methoxy-3-isopropylpyrazine (black correlations) and 2-methoxy-3-isobutylpyrazine (gray correlations) (9C).

FIG. 10 provides typical relaxation curves for Val-12C^(γ1) obtained from (¹³C^(γ1γ2)HD₂)₂ ¹²C^(α,β)D-L-valine enriched MUP; solid line: free protein; dashed line: 2-methoxy-3-isopropylpyrazine complex; dash-dotted line: 2-methoxy-3-isobutylpyrazine complex. For clarity, only the data points corresponding to the free protein are shown. Error bars are smaller than the symbols used for these data points.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Specific labeling of a protein in the backbone has been shown to be very effective in providing high resolution and sensitive data to expedite the assignment of the backbone signals of a protein (Coughlin et al., J. Am. Chem. Soc. 121, 11871-11874, 1999; Giesen et al., J. Biomol. NMR 19, 255-260, 2001) and also the determination of the Global Fold (Giesen et al., J. Biol. NMR, pp. 1-9, 2002). In a like manner, embodiments of the present invention provide for specific labeling in the side chains of amino acids in a protein or protein/ligand complex, thereby increasing the sensitivity and resolution of NMR studies to determine the dynamics of relevant amino acid sidechains. Embodiments of the present invention further provide methods to produce amino acids that contain a pure ¹³CHD₂ isotopomer, e.g. (¹³C^(γ1γ2)HD₂)₂ ¹²C^(α,β)D-L-valine, the most sensitive isotopomer possible. Moreover, embodiments of the present invention provide methods for the isotopomer to be contained in an NMR “invisible” environment, thereby maximizing resolution and increasing sensitivity still further. Protein is prepared by including the desired amino acid in an appropriate bacterial, yeast, insect or mammalian growth medium for growth of cells that express or overexpress the desired protein. The resulting protein contains isotopically enriched atoms only in the desired species of amino acid, e.g., valine or lysine, thereby maximizing the resolution and sensitivity of the NMR studies.

The invention provides a means for rapidly determining by NMR the dynamics of the sidechain of a protein, or protein/drug complex, of a size considerably larger than heretofore possible. The term “dynamics” refers to the entropic component of particular atoms or bond vectors in a protein with respect to three dimensional structure and information of the protein, with or without binding of a ligand. The invention allows this information to be obtained by increasing the resolution of signals of interest from one or more pairs of atoms in one or more amino acids in the protein while simultaneously reducing the tendency of the magnetization of these atoms in an NMR study to diffuse. This is accomplished by specifically labeling one or more of the amino acids in the protein in the side chain with one or more pairs of NMR active nuclei (such as ¹H, ¹³C or ¹⁵N, for example) that are covalently bonded together. All other atoms of both the side chain and the backbone of the amino acid preferably are selected to minimize sensitivity losses and to increase resolution. Preferably, the bonded pairs of NMR active atoms in the labeled amino acid are ¹H and ¹³C or ¹H and ¹⁵N, such as those contained in -¹³CHD₂, -¹³CHD-, -¹³CH- and -¹⁵NH- groups, and all other nuclei in the protein are essentially ¹²C, ¹⁴N and D (²H, deuterium). The methods and compositions of embodiments of this invention are isotopically substituted or enriched so that a single bond vector is composed of two NMR active nuclei while all other bond vectors are NMR inactive. Preferably, all other atoms are NMR inactive.

The term “active,” when referring to NMR-active nuclei, is used according to the common usage in the art of NMR studies. Natural abundance refers to the isotopes of an atom that occur in nature. One of skill in the art will recognize that atoms do not exist in a single isotope in nature, and therefore that an atom such as carbon, for example, will exist as ¹²C for the most part, but also will exist to a certain degree as ¹³C, naturally. Therefore a carbon-containing molecule that is unlabeled nevertheless will contain a small amount of other isotopes as well. Thus, a carbon position in a molecule that is essentially ¹²C contains ¹²C in the same or essentially the same ratio (abundance) as occurs in nature. Other atoms such as nitrogen and hydrogen also occur naturally as different isotopes and therefore the terms “natural abundance” and “essentially” may be understood in an analogous fashion with respect to any atom. One of skill in the art also will recognize that an isotopically substituted atom also is not 100% of the stated isotope but rather is enriched in the stated isotope. The term enriched refers to an isotope that is present at a level greater than natural abundance, up to about 5-100%, preferably about 5-20% or about 10-20% or most preferably about 10%.
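The point that an unlabeled molecule still carries a small amount of the minor isotope can be made concrete with a short Python sketch. It assumes a natural ¹³C abundance of roughly 1.1% and independent incorporation at each carbon position; the constant and function names are illustrative only.

```python
C13_NATURAL_ABUNDANCE = 0.011  # approximate natural fraction of carbon that is 13C


def prob_at_least_one_c13(n_carbons, abundance=C13_NATURAL_ABUNDANCE):
    """Probability that an unlabeled molecule contains one or more 13C atoms."""
    return 1.0 - (1.0 - abundance) ** n_carbons


# Valine has five carbon atoms.
valine_prob = prob_at_least_one_c13(5)
```

Under these assumptions, a little over 5% of unlabeled valine molecules contain at least one ¹³C atom, which is the sense in which a position described as essentially ¹²C is not isotopically pure.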

Although the methods of this invention are suitable for the determination of dynamic information of any peptidic molecule of three or more amino acids in length, and therefore encompass both proteins and peptides, the description, for simplicity, will refer only to proteins. It is understood that the term “protein,” as used in this application, refers to any peptidic molecule of three or greater amino acids, or, for example, peptides and proteins of about 5 kD or greater molecular weight. The compositions and methods of the present invention therefore advantageously may be employed in connection with proteins having molecular masses of about 5 kD or more, or proteins of about 50 amino acid residues or more. The methods are particularly useful for proteins of 20-30 kD or larger, which have been difficult to study using prior art methods, and even more particularly proteins of 50 or 55 kD or more or 75 kD, or proteins of 100 kD or larger. Therefore any protein of 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150 kD or more, or complexes of such proteins, are suitable for structural and dynamic information determinations according to embodiments of this invention. The methods may be used to study membrane proteins as well. Of course, smaller proteins and peptides may be studied using the inventive methods, including oligopeptides and any peptide of three or greater amino acids. Proteins containing the specifically labeled amino acids may be chemically synthesized from scratch or beginning with natural amino acids, or expressed by cells in culture, for example by bacterial, yeast, mammalian or insect cells.

The amino acids of the protein are labeled at specific positions with any combination of the NMR-relevant isotopes ²H, ¹³C and/or ¹⁵N such that only those atoms required to be visible in the spectrum are detected. Those skilled in the art will recognize that a key step required in the elucidation of protein dynamics by NMR is the measurement of the rate of decay of magnetization from a bond vector, such as for example a C-H or an N-H bond vector. The vector to be measured is labeled with ¹³C or ¹⁵N, to form a ¹³C-H or ¹⁵N-H bond vector in the amino acid or acids which are desired to be analyzed. Any bond vector can be specifically labeled with an appropriate isotope, such as the NMR-active isotopes ¹H, ¹³C, ¹⁵N, ¹⁷O or any other necessary isotope, while the remainder of the bond vectors are NMR inactive and consist of ²H (D), ¹²C, ¹⁴N, ¹⁶O, etc. This is in contrast to earlier methods where amino acids were labeled either universally by isotope type, e.g. with commercially available ¹⁵N₂-lysine (Cambridge Isotope Laboratories), or were partially but randomly labeled during protein synthesis (Rosen, M. K., Gardner, K. H., Willis, R. C., Parris, W. E., Pawson, T., and Kay, L. E. (1996) J. Mol. Biol. 263, 627-636; Gardner, K. H., Rosen, M. K., and Kay, L. E. (1997) Biochemistry 36, 1389-1401; Gardner, K. H., and Kay, L. E. (1997) J. Am. Chem. Soc. 119, 7599-7600).

The rate of decay of magnetization is an inverse function of the dynamic energy of the bond vector. The bond vector to be labeled for analysis by NMR may be any bond vector of interest in the protein. However, for preference the bond vector of choice often will be near, and preferably at, the terminus of an amino acid side chain for maximum sensitivity when ligand-binding areas of a protein are to be studied.
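A decay rate of this kind is commonly extracted by fitting signal intensity against relaxation delay. The Python sketch below is a minimal illustration that assumes a single-exponential decay, I(t) = I0·exp(−R·t), and fits it by least squares on the logarithm of the intensities. The function name, delays and intensities are hypothetical synthetic values, not data from the figures, and this is not presented as the fitting procedure used in the work described here.

```python
import math


def fit_decay_rate(times, intensities):
    """Fit ln(I) = ln(I0) - R*t by ordinary least squares.

    Returns (R, I0). Assumes all intensities are positive and the decay
    is single-exponential.
    """
    logs = [math.log(value) for value in intensities]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)


# Synthetic, noise-free example: delays in seconds, true rate 2.5 per second.
delays = [0.01, 0.05, 0.10, 0.20, 0.40]
true_rate = 2.5
signal = [100.0 * math.exp(-true_rate * t) for t in delays]

rate, initial_intensity = fit_decay_rate(delays, signal)
```

Comparing rates fitted in this way for the free protein and for each ligand complex is one way to quantify a change in the decay of magnetization at a given bond vector on binding.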

It will further be appreciated by those skilled in the art that many of the side chains of amino acids are composed of (CH₃)- and -(CH₂)- groups. Therefore, for maximum sensitivity of a particular C-H bond vector in such a group, the other protons covalently attached at that group are preferably replaced with deuterium atoms such that the C-H bond vector of interest is isolated and its retention of magnetization enhanced. Therefore, it is particularly preferred that the C-H bond vectors desired to be studied are labeled as follows: -¹³CHD₂, -¹³CHD- and -¹³CH-. Preferably, all other carbon, nitrogen, and hydrogen atoms of the amino acids of the protein are NMR inactive (e.g. ¹²C, ¹⁴N, and D (deuterium)). In another particularly preferred embodiment of the invention, the isolated bond vectors for study are ¹⁵NH or ¹⁷OH, with all other atoms of the amino acids of the protein NMR inactive, analogous to the strategy described above.

Amino acids have been chemically synthesized in unlabeled forms by various means, and some have been synthesized in specifically isotopically labeled forms (for reviews see Martin, Isotopes Environ. Health Stud. 32, 15, 1996; Schmidt, Isotopes Environ. Health Stud. 31, 161, 1995). Thus, Ragnarsson et al. (J. Chem. Soc. Perkin Trans. 1: 2503, 1994) synthesized 1,2-¹³C₂, ¹⁵N Ala, Phe, Leu, Tyr; 1,2-¹³C₂, 3′,3′,3′-²H₃, ¹⁵N Ala; 1,2-¹³C₂, 3′,3′-²H₂, ¹⁵N Phe; and 3′,3′,3′-²H₃ Ala. Ragnarsson (17) also synthesized 1,2-¹³C₂, 2-²H, ¹⁵N Ala, Leu and Phe and 1,2-¹³C₂, 2,2-²H₂, ¹⁵N Gly; these were partly used for conformational studies of a pentapeptide, Leu-enkephalin. Unkefer synthesized ¹⁵N-labeled Ala, Val, Leu and Phe as well as 1-¹³C, ¹⁵N Val. More recently, methods for the preparation of backbone-labeled amino acids have been developed, and these have been used for the assignment of backbone signals as well as to determine the backbone global fold of a protein. In all these cases, the appropriate amino acid sidechain was added stereoselectively to glycine derivatized in a chiral complex.

Amino acid precursors for selective protonation of certain ¹³C-labeled amino acids in proteins during cell culture have been described. These materials, ¹³C₄-3,3-²H₂-α-isobutyrate and ¹³C₅-3-²H-α-isoketovalerate, have been used to produce ¹³CH₃-methyl leucine, isoleucine and valine residues in a perdeuterated protein expressed in bacteria. Rosen et al., J. Mol. Biol. 263, 627-636, 1996; Gardner et al., Biochemistry 36, 1389-1401, 1997; Gardner et al., J. Am. Chem. Soc. 119, 7599-7600, 1997. Very recently, unlabeled α-isobutyrate and α-isoketovalerate containing terminal ¹³CH₃ groups also have been described for much the same purpose. Hajduk et al., J. Am. Chem. Soc. 122, 7898-7904, 2000. All these methods and materials have increased the scope of analysis of proteins by NMR; however, these labeling methods and the NMR analytical procedures they allow do not provide the maximum possible sensitivity of the NMR analysis, nor are they applicable to every possible protein type. Moreover, not all the labeling patterns are applicable to every amino acid type. For instance, ¹³CH₃-methyl leucine, isoleucine and valine residues in a perdeuterated environment are of value only to studies involving those residue types or residues close to them. Other residues of potential interest to binding studies, including other hydrophobic species such as phenylalanine, cannot be studied in this way.

Therefore, a particularly preferred aspect of the present invention is a method for the synthesis of all twenty amino acids specifically labeled in the side chain with —(¹³CHD₂), —(¹³CHD)—, —(¹³CH)— and/or —(¹⁵NH)— moieties, all other parts of the amino acids of the protein being essentially in the form of the nuclei ¹²C, ¹⁴N and deuterium. Amino acids specifically labeled in this way may be synthesized by asymmetric synthesis from glycine, using methods such as those cited above together with an appropriately isotopically labeled sidechain precursor. Precursors such as ¹³CHD₂-labeled methyl iodide are available commercially. Preferably, therefore, the amino acids are synthesized from glycine using side chain precursors themselves prepared from specifically labeled precursors such as ¹³CHD₂-labeled methyl iodide.

As noted, many methods for the synthesis of amino acids from glycine have been described (Duthaler, Tetrahedron 50, 1539, 1994; Schöllkopf, Topics Curr. Chem., 109(65): 1983; Oppolzer, Tett. Letts., 30:6009, 1989; Helvetica Chimica Acta, 77:2363, 1994; Helvetica Chimica Acta 75:1965, 1992) and these may be used in accordance with the present invention. In a preferred aspect of the invention, however, glycine is first converted to a nickel(II) transition metal complex according to the methods of Belokon et al. (J. Chem. Soc. Perkin. Trans. 1: 1525-1529, 1992). The derivatized glycine then is alkylated by treatment with a base, such as sodium hydroxide, sodium methoxide or, preferably, potassium t-butoxide, followed by addition of the appropriate sidechain precursor. The precursor is an alkyl chain containing a ¹³C—¹H and/or ¹⁵N—H label where required and ¹²C, ¹⁴N and ²H in all other positions, bonded to a chemical leaving group such as bromide, iodide, 4-nitrobenzenesulfonate, etc. at the appropriate position.

Precursors of this type can be readily synthesized from commercially available materials. Thus, (¹³CHD₂)₂—CD-iodide, the desired precursor for specifically labeled valine, can be prepared from ¹³CHD₂-labeled methyl iodide via a Grignard reaction with magnesium and deuterated ethyl formate, and halogenation of the resulting specifically labeled isopropyl alcohol. See FIG. 2. This specifically labeled sidechain then can be added to glycine derivatized as a chiral complex, with the desired specifically labeled valine being obtained via acid hydrolysis. See FIG. 3.

Alternatively, the specifically labeled isopropanol can be oxidized to the correspondingly labeled acetone and the carbon chain extended by treatment with a methylene donor such as dimethyloxosulfonium methylide to yield the required precursor for specifically labeled leucine. See FIG. 4. Addition of the iodide to the glycine complex as above yields specifically ¹³CHD₂-labeled leucine.

Alternatively, ¹³CHD₂-iodide can be reacted with protected beta-mercaptoethanol, as shown in FIG. 5. Reaction of the specifically ¹³CHD₂-labeled thioether with the glycine complex and subsequent acid hydrolysis yields methionine. In this way, specifically labeled sidechains of all the alkyl amino acids can be constructed.

The following examples are provided to illustrate and not to limit the invention claimed herein.

EXAMPLES

Example 1 Synthesis of L-(¹³C^(γ1γ2)HD₂)₂ ¹²C^(α,β)D-L-Valine

Magnesium turnings (6.08 g, 250.00 mmol, 2.50 equiv.) and anhydrous ether (100 mL) were added into a 3-neck 500 mL round bottom flask equipped with condenser, mechanical stirrer and heating mantle. The mixture was stirred and heated to gentle reflux. ¹³CHD₂-I (Cambridge Isotope Labs, 28.99 g, 200.00 mmol, 2.00 equiv.) in anhydrous ether (50 mL) was added dropwise into the Mg/ether mixture over 30 minutes and refluxing was continued for another 2 hours with the heating mantle to form a Grignard reagent. The reaction was then cooled in an ice bath.

D-CO—OCH₃ (Cambridge Isotope Labs, 6.11 g, 100.00 mmol, 1.00 equiv.) in anhydrous ether (50 mL) was added slowly into the Grignard reagent over 15 minutes. The ice bath was removed and the reaction mixture was stirred for another 4 hours. The reaction mixture then was cooled again in an ice bath and saturated aqueous NH₄Cl solution (35 mL) was added slowly over 15 minutes to quench the Grignard reaction. The reaction mixture was filtered and the solid was rinsed with ether (2×100 mL). The combined ether solutions were dried (MgSO₄) and filtered. The filtrate was distilled slowly and carefully through a 10-inch column containing glass helices. When the temperature of the distillate reached 40° C., the remaining liquid was transferred into a 15 mL pear-shaped flask and distilled through a 1-inch Vigreux column to give (¹³CHD₂)¹²CD-iodide (1.46 g, 21.76 mmol, 21.76% yield, F.W.=67.11; bp 78-82° C.).

BPB-Ni(II)-Gly red complex (7.22 g, 14.49 mmol, 1.00 equiv.), prepared essentially by the method of Belokon et al., was suspended in anhydrous CH₃CN (200 mL) at room temperature. (¹³CHD₂)¹²CD-iodide (2.83 g, 15.99 mmol, 1.10 equiv.) in anhydrous CH₃CN (10 mL) was added to the red reaction suspension, followed after 5 minutes by NaO^(t)Bu (1.54 g, 16.02 mmol, 1.11 equiv.). After 4 hours, thin layer chromatography (silica gel, acetone/CHCl₃=1/5) revealed the presence of a trace of unreacted starting material and a major spot with a higher R_(f) value. Therefore, glacial acetic acid (3.84 g, 3.66 mL, d=1.049, 63.95 mmol, 4.41 equiv.) was added to quench the reaction.

The reaction mixture was concentrated under reduced pressure and poured into a 2 L Erlenmeyer flask containing H₂O (1 L). The mixture was extracted with CH₂Cl₂ (3×200 mL) and the combined organic layers were washed with H₂O (2×200 mL) and then brine solution (200 mL). The organic phase was dried (MgSO₄) and evaporated to provide a red crude foamy glass. The crude product was subjected to further purification by flash column chromatography on silica gel using chloroform:acetone as eluant. The appropriate fractions were combined and evaporated to dryness. The residue was dissolved in 1:1 toluene:MeOD (Isotec, 100 mL), treated with sodium metal (200 mg) and the whole heated to reflux overnight. On cooling, the reaction mixture was treated with deutero-acetic acid (Cambridge Isotope Labs, 1 mL) and concentrated under reduced pressure. The mixture was extracted with CH₂Cl₂ (3×200 mL) and the combined organic layers were washed with H₂O (2×200 mL) and then brine solution (200 mL). The organic phase was dried (MgSO₄) and evaporated. The resulting red foamy glass was dissolved in methylene chloride (10 mL) and added dropwise to stirred hexane (2 L). The suspension was stirred overnight, filtered and the collected solid dried to provide (¹³C^(γ1γ2)HD₂)₂-¹²C^(α,β)D-L-valine-Ni(II)-BPB (4.32 g, 7.89 mmol, 54.45% yield).

This complex (4.32 g, 7.89 mmol, 1.00 equiv.), CH₃OH (60 mL), and 2 M HCl (60 mL) were heated at reflux for 10 minutes. The pale green solution was evaporated to dryness on a rotary evaporator. H₂O (50 mL) was added to the dried solid. The mixture was cooled in an ice bath for several hours, followed by filtration. Thin layer chromatography (silica gel, BuOH/acetic acid/water=2/1/1) of the filtrate showed the presence of the title compound (¹³C^(γ1γ2)HD₂)₂-¹²C^(α,β)D-L-valine. The filtrate was dried and the title compound was isolated by ion exchange chromatography and crystallized from aqueous ethanol (0.74 g, 5.91 mmol, F.W.=125.16, 75% yield from BPB-Ni(II)-glycine). See FIG. 6.
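The molar quantities, equivalents and yields quoted in Example 1 follow from simple mole arithmetic, which can be cross-checked with a short script. The following Python sketch is illustrative only: the function names are hypothetical and are not part of the described method, and the inputs are the masses, formula weights and mmol values stated in the example.

```python
def mmol(mass_g, formula_weight):
    """Amount of substance in mmol from a mass (g) and formula weight (g/mol)."""
    return 1000.0 * mass_g / formula_weight


def equivalents(reagent_mmol, limiting_mmol):
    """Molar equivalents of a reagent relative to the limiting reagent."""
    return reagent_mmol / limiting_mmol


def percent_yield(product_mmol, limiting_mmol):
    """Percent yield of a product relative to the limiting reagent."""
    return 100.0 * product_mmol / limiting_mmol


# Values taken from Example 1.
glycine_complex_mmol = 14.49            # BPB-Ni(II)-Gly starting complex
valine_complex_mmol = 7.89              # alkylated valine-Ni(II)-BPB complex
valine_mmol = mmol(0.74, 125.16)        # isolated labeled valine

iodide_equiv = equivalents(15.99, glycine_complex_mmol)
alkylation_yield = percent_yield(valine_complex_mmol, glycine_complex_mmol)
hydrolysis_yield = percent_yield(valine_mmol, valine_complex_mmol)
overall_yield = percent_yield(valine_mmol, glycine_complex_mmol)

print(f"iodide equivalents: {iodide_equiv:.2f}")
print(f"alkylation yield:   {alkylation_yield:.2f}%")
print(f"hydrolysis yield:   {hydrolysis_yield:.1f}%")
print(f"overall yield:      {overall_yield:.1f}%")
```

On the stated figures, the 75% value corresponds to the hydrolysis step calculated relative to the 7.89 mmol of valine complex; calculated relative to the 14.49 mmol of starting glycine complex, the same 5.91 mmol of product is about 41%.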

Example 2 Expression and Purification of Murine Urinary Protein (MUP) Containing L-valine-α-D-¹²CD(¹³CHD₂)₂

A 20 mL stock culture of M15 cells transformed with the vector pqe30 MUP was used to inoculate 1 L of medium containing 500 mg alanine, 400 mg arginine, 400 mg aspartic acid, 50 mg cysteine, 400 mg glutamine, 650 mg glutamic acid, 550 mg glycine, 100 mg histidine, 230 mg isoleucine, 230 mg leucine, 420 mg lysine HCl, 250 mg methionine, 130 mg phenylalanine, 100 mg proline, 2.1 g serine, 230 mg threonine, 170 mg tyrosine, 230 mg valine, 500 mg adenine, 650 mg guanosine, 200 mg thymine, 500 mg uracil, 200 mg cytosine, 1.5 g sodium acetate (anhydrous), 1.5 g succinic acid, 750 mg NH₄Cl, 850 mg NaOH, 10.5 g K₂HPO₄ (anhydrous), 2 mg CaCl₂·2H₂O, 2 mg ZnSO₄·7H₂O, 2 mg MnSO₄·H₂O, 50 mg tryptophan, 50 mg thiamine, 50 mg niacin, 1 mg biotin, 20 g glucose, 4 mL 1 M MgSO₄, 1 mL 0.01 M FeCl₃, 15 mg ampicillin, and 50 mg kanamycin.

When cell density had reached an OD of 1.2, the cells were harvested by centrifugation, rinsed with PBS, recentrifuged and resuspended in a medium of the above proportions but in which L-valine-α-D-¹²CD(¹³CHD₂)₂ was substituted for the unlabeled valine. After 30 minutes, protein expression was induced by addition of IPTG to a final concentration of 0.1 mmol. After 6 hours, the cultured cells were centrifuged at 4000 rpm for 20 minutes in a Sorvall RC-3B centrifuge. The cell pellet was then stored at −20° C. overnight. The cells were thawed and resuspended in 30 mL of sonication buffer (50 mM sodium phosphate, 500 mM NaCl, pH=8.0). The cells were broken by passing them through a French Press four times at 20,000 psi. The broken cells were subjected to sedimentation at 15,000×g for 20 minutes in Oakridge tubes.

A 5 mL Ni-NTA immobilized metal affinity column was equilibrated in sonication buffer at 5 mL/min. The supernatant (cell lysate) was removed from the Oakridge tube without disturbing the pellet. Cleared lysate was loaded onto the column at 5 mL/min. The column flow-through was saved for later analysis. The material bound to the column was then washed with sonication buffer for 30 minutes until the absorbance of the effluent was less than 0.020. Bound protein was eluted from the column with Elution Buffer (50 mM sodium phosphate, 500 mM NaCl, 500 mM imidazole, pH=8.0) using a single step. The peak fraction was collected manually. A sample of the eluate was saved for analysis.

A 300 mL XK-50 column packed with Sephadex G-25 Fine size exclusion chromatography (SEC) resin was equilibrated with anion exchange Buffer A (20 mM Tris HCl, pH=7.5). The Ni-NTA eluate was loaded onto the SEC column at 15 mL/min. The protein peak was collected manually. The protein sample, now in anion exchange Buffer A, was stored at 4° C. during preparation of the next step. A 10 mL Resource Q anion exchange column was equilibrated in Buffer A. The partially purified protein was loaded onto the column at 10 mL/min. The sample was washed with Buffer A for three minutes. The sample was eluted with a linear gradient into Buffer B (20 mM Tris HCl, 1 M NaCl, pH=7.5) over seven minutes. The fractions were collected at 30 second intervals. A sample of each fraction was set aside for analysis.
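The anion exchange step implies a simple fraction schedule: a seven minute linear gradient collected at 30 second intervals. The Python sketch below tabulates the fraction number, volume and approximate NaCl concentration at the midpoint of each fraction. It is an illustration under stated assumptions: the elution flow rate is assumed to equal the 10 mL/min loading flow rate (the text does not state the elution flow rate), column dead volume and mixing are ignored, and the function name is hypothetical.

```python
def gradient_fractions(gradient_min=7.0, fraction_s=30.0, flow_ml_min=10.0,
                       start_nacl_m=0.0, end_nacl_m=1.0):
    """Return (fraction number, volume in mL, NaCl in M at fraction midpoint).

    Assumes an ideal linear gradient from Buffer A (no NaCl) to Buffer B
    (1 M NaCl) and a constant flow rate (assumed, not stated in the text).
    """
    n_fractions = int(round(gradient_min * 60.0 / fraction_s))
    volume_ml = flow_ml_min * fraction_s / 60.0
    fractions = []
    for index in range(n_fractions):
        midpoint_min = (index + 0.5) * fraction_s / 60.0
        nacl_m = start_nacl_m + (end_nacl_m - start_nacl_m) * midpoint_min / gradient_min
        fractions.append((index + 1, volume_ml, nacl_m))
    return fractions


fractions = gradient_fractions()
for number, volume_ml, nacl_m in fractions:
    print(f"fraction {number:2d}: {volume_ml:.1f} mL, ~{nacl_m:.2f} M NaCl")
```

Under these assumptions the gradient yields 14 fractions of 5 mL each, which gives a rough guide to the salt concentration at which a given fraction eluted.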

The material was analyzed by SDS-PAGE using a 12% Tris/Glycine gel at a constant 200 volts for 45 minutes. Results are shown in FIG. 7. The pure fractions (lanes 7-8) were loaded into a 3000 MWCO Slide-A-Lyzer™ dialysis cassette. The protein was dialyzed at 4° C. into 50 mM sodium phosphate pH 7.0. Two buffer changes ensured complete removal of the Tris buffer. The final protein concentration was determined from the UV absorbance at 280 nm, using the extinction coefficient for MUP (0.503 at 1 mg/mL). The final concentration of pure Murine Urinary Protein was 48 mg/mL in 1.9 mL. Total yield was 90 mg. The final MUP sample was stored at 4° C. prior to NMR analysis.
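The concentration and yield figures above follow from the stated extinction coefficient (an absorbance of 0.503 at 280 nm for a 1 mg/mL solution). The Python sketch below shows the arithmetic. The absorbance reading and dilution factor in the example are hypothetical, since the text does not report the raw reading; a 1 cm path length is also an assumption, and the function names are illustrative.

```python
MUP_A280_PER_MG_ML = 0.503  # absorbance at 280 nm of a 1 mg/mL MUP solution


def concentration_mg_per_ml(a280, dilution_factor=1.0, path_length_cm=1.0):
    """Protein concentration of the undiluted sample from an A280 reading."""
    return a280 * dilution_factor / (MUP_A280_PER_MG_ML * path_length_cm)


def total_yield_mg(concentration, volume_ml):
    """Total protein (mg) from concentration (mg/mL) and volume (mL)."""
    return concentration * volume_ml


# Hypothetical reading: A280 = 0.483 measured after a 1:50 dilution.
example_concentration = concentration_mg_per_ml(0.483, dilution_factor=50.0)

# Reported values from the example: 48 mg/mL in 1.9 mL.
reported_yield = total_yield_mg(48.0, 1.9)

print(f"example concentration: {example_concentration:.1f} mg/mL")
print(f"yield from reported values: {reported_yield:.1f} mg")
```

The reported 48 mg/mL in 1.9 mL corresponds to 91.2 mg, consistent with the stated total yield of about 90 mg.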

Example 3 NMR Analysis of Murine Urinary Protein (MUP) Containing L-valine-α-D-¹²CD(¹³CHD₂)₂

A 15 mg sample of L-valine-α-D-¹²CD(¹³CHD₂)₂ labeled MUP was dissolved in 650 μL phosphate buffered saline (10 mM potassium phosphate; 200 mM sodium chloride), to which was added 50 μL deuterium oxide. ¹³C relaxation rates (R₂ and R₁) of ¹³CHD₂ groups were determined using pulse sequences as described in Ishima et al., J. Am. Chem. Soc. 121:11589-11590 (1999). Spectral parameters were as follows: spectral width in the ¹³C dimension, 900 Hz; spectral width in the ¹H dimension, 5200 Hz; number of real data points in the ¹³C dimension, 128; number of real points in the ¹H dimension, 1408; number of transients per t₁ increment, 8; probe temperature, 298 K. Prior to two-dimensional Fourier transformation, free induction decays were apodized with cosine-bell windowing functions according to known methods.

The NMR analysis was then repeated following addition of 1 μL of a small molecule ligand, either isobutyl pyrazine or isopropyl pyrazine, to separate solutions of MUP. The data shown below in Tables I and II and in FIG. 8 were obtained, clearly showing the changes in T1 and T2 values on addition of these small molecule ligands. The relaxation data obtained in the presence of the ligands 2-methoxy-3-isopropylpyrazine and 2-methoxy-3-isobutylpyrazine indicate that valine side-chains do not contribute significantly to the entropy of binding.

TABLE I
Valine (-¹³CHD₂) Labeled Mouse Urinary Protein T1 and T2 Relaxation Times in the Absence and Presence of Isobutylpyrazine Ligand.

Residue   Protein     Error     Bound Protein   Error     Change

T1 values (milliseconds)
12.HG1    1724.922    26.602    1760.141        85.914      35.219
47.HG1    1628.719    23.207    1619.169        47.052       9.550
47.HG2     990.621    25.903    1072.690        45.334     −82.069
53.HG1    2233.271    38.470    2332.247        82.853     −98.976
53.HG2    1364.726    21.978    1382.566        58.719      17.840
59.HG1    1282.763    21.852    1367.075        67.324     −84.312
59.HG1    1532.137    36.468    1502.825        69.414      20.312
70.HG1    2081.572    27.198    1701.953        57.518     379.619
70.HG2    1163.528    15.853    1488.130        82.016    −324.602
82.HG1     914.901    15.870     913.641        27.002       1.260
82.HG2     669.801    11.195     703.150        26.093     −33.349

T2 values (milliseconds)
12.HG1     153.108     2.257     166.608         5.221     −13.500
12.HG2     183.314     2.276     200.110         6.007     −16.796
47.HG1     155.232     1.994     171.287         5.421     −16.055
47.HG2     150.754     3.469     162.950         5.232     −12.196
53.HG1     163.699     2.187     174.600         4.773     −10.901
53.HG2     150.283     2.148     163.379         4.812     −13.096
59.HG1     178.648     2.787     200.992         7.122     −22.344
59.HG1     147.716     3.338     150.756         6.220      −3.040
70.HG1     121.202     1.399     186.968         5.222     −65.766
70.HG2     162.033     2.083     128.164         5.129      33.869
82.HG1     224.383     3.731     216.738         5.185       7.645
82.HG2     223.453     3.198     201.054         5.870      22.399

TABLE II
Valine (-¹³CHD₂) Labeled Mouse Urinary Protein T1 and T2 Relaxation Times in the Absence and Presence of Isopropylpyrazine Ligand.

Residue   Protein     Error     Bound Protein   Error     Change

T1 values (milliseconds)
12.HG1    1724.922    26.602    1733.309        39.761      −8.387
12.HG2    1468.411    20.239    1509.370        24.959     −40.959
47.HG1    1628.719    23.207    1626.298        23.067       2.421
47.HG2     990.621    25.903    1067.605        23.922     −76.984
53.HG1    2233.271    38.470    2283.543        38.045     −50.272
53.HG2    1364.726    21.978    1355.764        28.603       8.962
59.HG1    1282.763    21.852    1374.428        32.978     −91.665
59.HG1    1523.137    36.468    1572.793        35.240     −49.656
70.HG1    2081.572    27.198    1872.254        27.668     209.318
70.HG2    1163.528    15.853    1077.097        22.081      86.431
82.HG1     914.901    15.870     925.102        13.185     −10.201
82.HG2     669.801    11.195     668.810        12.161       0.991

T2 values (milliseconds)
12.HG1     153.108     2.257     161.437         3.401      −8.329
12.HG2     183.314     2.276     192.512         3.174      −9.198
47.HG1     155.232     1.994     168.574         2.716     −13.342
47.HG2     150.754     3.469     155.315         3.433      −4.561
53.HG1     163.699     2.187     177.873         2.505     −14.174
53.HG2     150.283     2.148     158.149         3.377      −7.866
59.HG1     178.648     2.787     195.222         4.380     −16.574
59.HG1     147.716     3.338     153.897         3.555      −6.181
70.HG1     121.202     1.399     162.943         2.001     −41.741
70.HG2     162.033     2.083     150.103         2.892      11.930
82.HG1     224.383     3.731     213.526         3.037      10.857
82.HG2     223.453     3.198     203.529         3.774      19.924
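One way to read the Change column against the reported errors is to propagate the two uncertainties in quadrature and ask whether the change exceeds a chosen multiple of the combined error. The Python sketch below does this for three T2 rows copied from Table II. The two-sigma criterion and the function names are assumptions made for illustration; the text does not state how significance was judged.

```python
import math

# (methyl label, free T2, free error, bound T2, bound error), in milliseconds.
# Values copied from the T2 section of Table II (isopropylpyrazine ligand).
ROWS = [
    ("47.HG2", 150.754, 3.469, 155.315, 3.433),
    ("70.HG1", 121.202, 1.399, 162.943, 2.001),
    ("82.HG2", 223.453, 3.198, 203.529, 3.774),
]


def change_with_error(free, free_err, bound, bound_err):
    """Change (free minus bound) and its uncertainty, errors added in quadrature."""
    delta = free - bound
    sigma = math.hypot(free_err, bound_err)
    return delta, sigma


def significant_changes(rows, n_sigma=2.0):
    """Labels whose change exceeds n_sigma combined errors (assumed criterion)."""
    flagged = []
    for label, free, free_err, bound, bound_err in rows:
        delta, sigma = change_with_error(free, free_err, bound, bound_err)
        if abs(delta) > n_sigma * sigma:
            flagged.append(label)
    return flagged


flagged = significant_changes(ROWS)
for label, free, free_err, bound, bound_err in ROWS:
    delta, sigma = change_with_error(free, free_err, bound, bound_err)
    print(f"{label}: change = {delta:8.3f} ms, combined error = {sigma:.3f} ms")
print("exceeding 2 sigma:", flagged)
```

The sign convention (free protein minus bound protein) matches the tabulated Change values, for example −41.741 ms for 70.HG1.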

Example 4 NMR Relaxation Measurements of [¹³C^(γ1),^(γ2)HD₂]-U-²H Valine

Samples of [¹³C^(γ1),^(γ2)HD₂]-U-²H valine enriched MUP, both alone and complexed with 2-methoxy-3-isopropylpyrazine and 2-methoxy-3-isobutylpyrazine, were prepared from a single sample at pH 7.0 and a protein concentration of 1 mM. A further sample of methyl-¹³C, ²H enriched MUP was prepared at pH 7.0 and a protein concentration of 0.5 mM. Longitudinal (R₁) and transverse (R₂) ¹³C relaxation rates were determined essentially as described by Ishima et al., J. Am. Chem. Soc. 121:11589-11590, 1999. All spectra were recorded at a proton frequency of 600 MHz and a probe temperature of 30° C. R₁ rates were determined with relaxation delay times of 16, 64, 128, 240, 400, 720, 1120, 1600, 2240 and 2880 milliseconds. R₂ rates were determined with relaxation delay times of 16, 32, 48, 64, 96, 160, 192, 240 and 288 milliseconds, and an effective field strength of 2 kHz. Relaxation data were fit to a single exponential decay function $I = I_0 \exp(-tR)$, where $I_0$ is the initial resonance intensity, $t$ is the relaxation delay time and $R$ is the relaxation rate ($R_1$ or $R_2$). The resulting rates were fitted to the Lipari-Szabo model-free spectral density of the form:

$$J(\omega) = \frac{S^2 \tau_M}{1 + \omega^2 \tau_M^2} + \frac{(1 - S^2)\,\tau_i}{1 + \omega^2 \tau_i^2},$$

where $\tau_i^{-1} = \tau_M^{-1} + \tau_e^{-1}$, $\tau_M$ is the overall molecular tumbling correlation time and $\tau_e$ is the effective internal correlation time. See Lipari and Szabo, J. Am. Chem. Soc. 104:4546-4559, 1982. This equation is valid for fast internal motions, $\tau_e \ll \tau_M$, under which conditions the order parameter for methyl CH dipolar relaxation is given by $S^2 = S_{axis}^2 [P_2(\cos\theta_H)]^2$, where $S_{axis}^2$ is the order parameter of the methyl rotation axis and $\theta_H$ is the angle made by the axis and the CH bond vector. A global rotational correlation time of 8.57 nanoseconds was used for these calculations, according to the previously reported value. Zidek et al., Nature Str. Biol. 6:1118-1121, 1999.

Although the valine residues were perdeuterated at non-methyl positions, the protein was otherwise protonated, and dipolar relaxation due to these “external” protons is not negligible for ¹³C relaxation. Thus, by analogy with the work of Ishima et al. (J. Am. Chem. Soc. 123:6164-6171, 2001), the contribution of these external protons to ¹³C R₂ values was estimated at 25% from the X-ray coordinates of MUP. Consequently, measured ¹³C R₂ rates were multiplied by 0.75 to account for these external dipolar relaxation processes.
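The relaxation analysis described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the procedure actually used: the exponential fit is a simple unweighted log-linear least-squares fit (a simplification swapped in for whatever fitting routine was used), the spectral density is written exactly in the form given in the text without any normalization prefactor, the intensities are synthetic and noise-free, and the angle of 109.5° between the methyl axis and the CH bond (ideal tetrahedral geometry) and the 50 ps internal correlation time are assumptions not stated in the text. All function names are hypothetical.

```python
import math


def fit_single_exponential(delays_s, intensities):
    """Log-linear least-squares fit of I = I0 * exp(-t * R); returns (I0, R)."""
    logs = [math.log(value) for value in intensities]
    n = len(delays_s)
    mean_t = sum(delays_s) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in delays_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(delays_s, logs))
    slope = sxy / sxx
    return math.exp(mean_y - slope * mean_t), -slope


def spectral_density(omega, s2, tau_m, tau_e):
    """Model-free J(omega) in the form given in the text (times in seconds)."""
    tau_i = 1.0 / (1.0 / tau_m + 1.0 / tau_e)
    return (s2 * tau_m / (1.0 + (omega * tau_m) ** 2)
            + (1.0 - s2) * tau_i / (1.0 + (omega * tau_i) ** 2))


def p2(x):
    """Second Legendre polynomial P2(x) = (3x^2 - 1) / 2."""
    return 0.5 * (3.0 * x * x - 1.0)


def s2_axis_from_s2(s2, theta_deg=109.5):
    """S2_axis = S2 / [P2(cos theta)]^2; theta = 109.5 degrees is an assumption."""
    return s2 / p2(math.cos(math.radians(theta_deg))) ** 2


def correct_r2_for_external_protons(r2_measured, factor=0.75):
    """Scale a measured 13C R2 by 0.75, as described in the text."""
    return factor * r2_measured


# Synthetic, noise-free R2 decay sampled at the R2 delay times from the text.
r2_delays_s = [0.016, 0.032, 0.048, 0.064, 0.096, 0.160, 0.192, 0.240, 0.288]
true_i0, true_r2 = 100.0, 6.0          # hypothetical values for illustration
synthetic = [true_i0 * math.exp(-t * true_r2) for t in r2_delays_s]

i0_fit, r2_fit = fit_single_exponential(r2_delays_s, synthetic)
j_zero = spectral_density(0.0, 1.0, 8.57e-9, 50e-12)  # tau_e is hypothetical
print(f"fitted I0 = {i0_fit:.3f}, fitted R2 = {r2_fit:.3f} s^-1")
print(f"corrected R2 = {correct_r2_for_external_protons(r2_fit):.3f} s^-1")
```

With real data, a weighted nonlinear fit with error estimation would normally be preferred over the log-linear shortcut shown here, because taking logarithms distorts the noise at long delay times.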

FIG. 9A shows a region from the ¹³C—¹H HSQC spectrum of methyl ¹³C, ²H-enriched mouse major urinary protein, containing valine Cγ1-Hγ1 and Cγ2-Hγ2 correlations. Significant overlap is present in the spectrum, arising from the combined effects of interference from resonances from ¹³CH₃ isotopomers, together with methyl resonances derived from residues other than valine. In contrast, an equivalent spectrum recorded on [¹³C^(γ1),^(γ2)HD₂]-U-²H valine enriched MUP (FIG. 9B) is essentially free from resonance overlap, and all twelve valine methyl groups can be observed and assigned. See Abbate et al., J. Biomol. NMR 15:187-188, 1999.

MUP binds the small hydrophobic ligands 2-methoxy-3-isopropylpyrazine and 2-methoxy-3-isobutylpyrazine with Kd values of 560 nM and 80 nM, respectively. FIG. 9C shows ¹³C—¹H HSQC spectra of complexes of [¹³C^(γ1),^(γ2)HD₂]-U-²H valine-enriched MUP with these ligands. Significant shift perturbations were observed for the correlations from Val 82 γ1 and γ2. This was anticipated since Val 82 is located within the binding pocket of MUP. Timm et al., Prot. Sci. 10:997-1004, 2001. In both complexes all correlations were well resolved, in contrast to complexes with methyl-¹³C, ²H enriched MUP (data not shown). Under these circumstances, measurement of ¹³C longitudinal and transverse relaxation rates (R₁ and R₂) can be undertaken in each sample at high sensitivity and with minimal interference due to resonance overlap. Relaxation curves were created. Typical results are shown in FIG. 10.

TABLE III
S² Values for Valine Methyl Groups Derived from 600 MHz ¹³C T₁ and T₂ Measurements on Major Urinary Protein Selectively Enriched with [¹³C^(γ1),^(γ2)HD₂]-U-²H Valine.

                          Ligand
                  None         2-methoxy-3-          2-methoxy-3-
                               isopropylpyrazine     isobutylpyrazine
Residue           S²_(axis)*   S²_(axis)             S²_(axis)
Val-12 C^(γ1)     0.88         0.90                  0.86
Val-12 C^(γ2)     0.77         0.69                  0.66
Val-47 C^(γ1)     0.91         0.84                  0.86
Val-47 C^(γ2)     0.86         0.90                  0.88
Val-53 C^(γ1)     0.74         0.80                  0.78
Val-53 C^(γ2)     0.90         0.85                  0.86
Val-59 C^(γ1)     0.73         0.68                  0.63
Val-59 C^(γ2)     0.98         0.86                  0.90
Val-70 C^(γ1)     0.78         0.85                  0.81
Val-70 C^(γ2)     0.77         0.77                  [1.03]
Val-82 C^(γ1)     0.53         0.50                  0.53
Val-82 C^(γ2)     0.46         0.48                  0.53

*The average error in the reported S²_(axis) values is 0.045 +/− 0.009. The bracketed value is anomalous due to strong coupling between Val-70 C^(γ2) and C^(γ1).

Values of S²_(axis) were derived from R₁ and R₂ data for each methyl group using the model-free spectral density function given above, and are listed in Table III. There were only minor changes in S²_(axis) for any of the six valine residues in MUP upon binding of either 2-methoxy-3-isopropylpyrazine or 2-methoxy-3-isobutylpyrazine, most of which are within experimental error. An exception is S²_(axis) for Val-70 C^(γ)2, which appears to increase dramatically on binding 2-methoxy-3-isobutylpyrazine. However, this is an anomalous result: Val-70 C^(γ)1 and C^(γ)2 possess similar shifts in both complexes, and differ by ~3 Hz in the 2-methoxy-3-isobutylpyrazine complex, under which conditions C^(γ)1 and C^(γ)2 will be strongly coupled via the two-bond homonuclear scalar coupling. This results in a poor fit of R₂ relaxation data for Val-70 C^(γ)2 (χ²=49.7).
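The comparison of order parameters with and without ligand can be illustrated with a short Python sketch that uses the Table III values and the stated average error of 0.045. The criterion used here, a change larger than two combined standard errors with the same average error assumed for every value, is an assumption made for illustration and is not the criterion stated in the text; the names are hypothetical.

```python
import math

AVERAGE_ERROR = 0.045  # average error in S2_axis reported with Table III

# methyl group: (no ligand, isopropylpyrazine complex, isobutylpyrazine complex)
S2_AXIS = {
    "Val-12 Cγ1": (0.88, 0.90, 0.86),
    "Val-12 Cγ2": (0.77, 0.69, 0.66),
    "Val-47 Cγ1": (0.91, 0.84, 0.86),
    "Val-47 Cγ2": (0.86, 0.90, 0.88),
    "Val-53 Cγ1": (0.74, 0.80, 0.78),
    "Val-53 Cγ2": (0.90, 0.85, 0.86),
    "Val-59 Cγ1": (0.73, 0.68, 0.63),
    "Val-59 Cγ2": (0.98, 0.86, 0.90),
    "Val-70 Cγ1": (0.78, 0.85, 0.81),
    "Val-70 Cγ2": (0.77, 0.77, 1.03),  # bracketed (anomalous) value in Table III
    "Val-82 Cγ1": (0.53, 0.50, 0.53),
    "Val-82 Cγ2": (0.46, 0.48, 0.53),
}


def flag_changes(table, sigma=AVERAGE_ERROR, n_sigma=2.0):
    """List (methyl, ligand, change) where |change| > n_sigma * sqrt(2) * sigma."""
    threshold = n_sigma * math.sqrt(2.0) * sigma
    flagged = []
    for methyl, (free, isopropyl, isobutyl) in table.items():
        for ligand, bound in (("isopropyl", isopropyl), ("isobutyl", isobutyl)):
            delta = bound - free
            if abs(delta) > threshold:
                flagged.append((methyl, ligand, round(delta, 2)))
    return flagged


flagged = flag_changes(S2_AXIS)
print(flagged)
```

Under this assumed criterion, the only change that stands out is the bracketed Val-70 C^(γ)2 value in the 2-methoxy-3-isobutylpyrazine complex, which the text identifies as anomalous.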

Measured S²_(axis) values for the two methyl groups of certain valine residues are not identical within experimental error. At first sight this is inconsistent with the requirement for their mobilities to be essentially the same, since they form part of the same isopropyl group. However, equal mobility is not necessarily reflected in equivalent S²_(axis) values. Order parameters of methyl groups from the same isopropyl group may differ if the effective averaging axis for this group makes different angles with the methyl threefold axes.

The constant values of S²_(axis) on complex formation for the methyl groups of Val-82 are unexpected, since this residue is located within the binding pocket and is proximal to the bound ligands. Intuitively, the relevant S²_(axis) values might be expected to increase due to decreased mobility within the binding pocket in the complexes. Since each ligand is protonated, additional relaxation pathways offered by ligand protons are offset by an equal and opposite effective contribution due to increased mobility of Val-82 side-chain atoms. In contrast, the remaining valine methyl groups are distal to the binding pocket, and ligand protons will make a negligible contribution to relaxation in these cases, since no significant structural changes in MUP are observed in the crystal structures of these complexes (data not shown).

Previous studies using ²H relaxation methods on calcium-saturated calmodulin in complex with a peptide model of the calmodulin-binding domain have highlighted significant changes in S²_(axis) values that cannot be predicted in a rational manner. Changes to S²_(axis) values were found at locations both proximal and distal to the Ca²⁺ binding site. The present observations suggest that this phenomenon may not be general. Indeed, the dominant entropic contribution to binding from protein dynamics may derive from the backbone, as has been suggested for the complex between MUP and a small-molecule ligand unrelated to the pyrazine derivatives described here. Zidek et al., Nature Str. Biol. 6:1118-1121, 1999. Computations of S²_(axis) values for each of the valine methyl groups, both in the free protein and in complexes with 2-methoxy-3-isopropylpyrazine and 2-methoxy-3-isobutylpyrazine, indicate that valine side-chains do not contribute significantly to the entropy of binding in this case. Since the methods according to the invention are highly sensitive, lower (10-20%) levels of ¹³C-enrichment may be used. Levels of about 5-30% are appropriate, preferably about 10-20%, and most preferably about 10%.

1. An amino acid wherein at least one bond vector in the side chain of said amino acid consists of two NMR-active nuclei bonded together and wherein essentially all other nuclei are NMR inactive.
2. An amino acid wherein at least one bond vector in the side chain of said amino acid consists of two NMR-active nuclei bonded together and wherein essentially all other bond vectors are NMR inactive.
3. The amino acid of claim 1 or claim 2 wherein said two NMR-active nuclei are ¹³C and ¹H, wherein the remainder of the carbon atoms in said amino acid are essentially ¹²C, wherein the nitrogen atoms in said amino acid are essentially ¹⁴N and wherein the remainder of the hydrogen atoms in said amino acid are essentially ²H.
4. The amino acid of claim 1 or claim 2 wherein said two NMR-active nuclei are ¹³C and ¹H, wherein the remainder of the carbon atoms in said amino acid are essentially ¹²C, wherein the nitrogen atoms in said amino acid are essentially ¹⁴N and wherein the remainder of the hydrogen atoms in said amino acid are natural abundance.
5. The amino acid of claim 1 or claim 2 wherein said two NMR-active nuclei are ¹⁵N and ¹H, wherein the remainder of the nitrogen atoms in said amino acid are essentially ¹⁴N, wherein the carbon atoms in said amino acid are essentially ¹²C and wherein the remainder of the hydrogen atoms in said amino acid are essentially ²H.
6. The amino acid of claim 1 or claim 2 wherein said two NMR-active nuclei are ¹⁵N and ¹H, wherein the remainder of the nitrogen atoms in said amino acid are essentially ¹⁴N, wherein the remainder of the carbon atoms in said amino acid are essentially ¹²C and wherein the remainder of the hydrogen atoms in said amino acid are natural abundance.
7. A culture medium comprising an amino acid of claim 1 or claim 2.
8. A protein comprising at least one amino acid of claim 1 or claim 2.
9. A method for analyzing the dynamics of a bond vector of a protein comprising producing said protein in a form which comprises an amino acid of claim 1 or claim 2 and subjecting said protein to NMR spectroscopy.
10. A method of determining the entropic contribution of a bond vector of a protein bound to a ligand comprising producing said protein in a form which comprises an amino acid of claim 1 or claim 2 and subjecting said protein to NMR spectroscopy in the presence and the absence of said ligand.
11. A method of preparing an isotopically substituted protein comprising culturing cells that express said protein in a medium containing at least one amino acid according to claim 1 or claim 2 and recovering said protein from the cell culture.